Comparative Modeling and Evaluation of CC-NUMA and COMA on Hierarchical Ring Architectures

نویسندگان

  • Xiaodong Zhang
  • Yong Yan
چکیده

Parallel computing performance on scalable shared-memory architectures is aaected by the structure of the interconnection networks linking processors to memory modules and on the eeciency of the memory/cache management systems. Cache Coherence Non-Uniform Memory Access (CC-NUMA) and Cache Only Memory Access (COMA) are two eeective memory systems, and the hierarchical ring structure is an eecient interconnection network in hardware. This paper focuses on comparative performance modeling and evaluation of CC-NUMA and COMA on a hierarchical ring shared-memory architecture. Analytical models for the two memory systems for comparative evaluation are presented. Intensive performance measurements on data migrations have been conducted on the KSR-1, a COMA hierarchical ring shared-memory machine. Experimental results support the analytical models, and we present practical observations and comparisons of the two cache coherence memory systems. Our analytical and experimental results show that a COMA system balances the work load well. However the overhead of frequent data movement may match the gains obtained from improving load balance. We believe our performance results could be further generalized to the two memory systems on a hierarchical network architecture. Although a CC-NUMA system may not automatically balance the load at the system level, it provides an option for a user to explicitly handle data locality for a possible performance improvement.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Comparative Modeling and Evaluation of CC-NUMA and COMA on Hierarchical Ring rchitectures

Parallel computing performance on scalable share& memory architectures is affected by the structure of the interconnection networks linking processors to memory modules and on the efficiency of the memory/cache management systems. Cache Coherence Nonuniform Memory Access (CC-NUMA) and Cache Only Memory Access (COMA) are two effective memory systems, and the hierarchical ring structure is an eff...

متن کامل

ASCOMA: An Adaptive Hybrid Shared Memory Architecture

Scalable shared memory multiprocessors traditionally use either a cache coherent non uniform memory access CC NUMA or simple cache only memory architecture S COMA memory architecture Recently hybrid architectures that combine aspects of both CC NUMA and S COMA have emerged In this paper we present two improvements over other hybrid architectures The rst improvement is a page allocation algorith...

متن کامل

Excel-NUMA: Toward Programmability, Simplicity, and High Performance

ÐWhile hardware-coherent scalable shared-memory multiprocessors are relatively easy to program, they still require substantial programming effort to deliver high performance. Specifically, to minimize remote accesses, data must be carefully laid out in memory for locality and application working sets carefully tuned for caches. It has been claimed that this programming effort is less necessary ...

متن کامل

Exploring the Value of Supporting Multiple DSM Protocols in Hardware DSM Controllers

The performance of a hardware distributed shared memory (DSM) system is largely dependent on its architect’s ability to reduce the number of remote memory misses that occur. Previous attempts to solve this problem have included measures such as supporting both the CC-NUMA and S-COMA architectures in the same machine and providing a programmable DSM controller that can emulate any DSM mechanism....

متن کامل

Design alternatives for shared memory multiprocessors

In this paper, we consider the design alternatives available for building the next generation DSM machine (e.g., the choice of memory architecture, network technology, and amount and location of per-node remote data cache). To investigate this design space, we have simulated five applications on a wide variety of possible DSM architectures that employ significantly different caching techniques....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IEEE Trans. Parallel Distrib. Syst.

دوره 6  شماره 

صفحات  -

تاریخ انتشار 1995